Real-time, scalable, content-based Twitter users recommendation
نویسندگان
چکیده
Real-time recommendation of Twitter users based on the content of their profiles is a very challenging task. Traditional IR methods such as TF-IDF fail to handle efficiently large datasets. In this paper we present a scalable approach that allows real time recommendation of users based on their tweets. Our model builds a graph of terms, driven by the fact that users sharing similar interests will share similar terms. We show how this model can be encoded as a compact binary footprint, that allows very fast comparison and ranking, taking full advantage of modern CPU architectures. We validate our approach through an empirical evaluation against the Apache Lucene’s implementation of TF-IDF. We show that our approach is in average two hundred times faster than standard optimized implementation of TF-IDF with a precision of 58 %.
منابع مشابه
Towards an expressive and scalable Twitter profile hash for users recommendation
Microblogging websites such as Twitter produce tremendous amounts of data each second. Identifying people to follow is a heavy task that cannot be completely done by users. Consequently, real time recommendation systems require very efficient algorithm to quickly process this massive amount of data, so as to recommend users having similar interests. In this paper we present a tractable algorith...
متن کاملHashGraph : an expressive and scalable Twitter users profile for recommendation
Microblogging websites such as Twitter produce tremendous amounts of data each second. Identifying people to follow is a heavy task that cannot be completely done by users. Consequently, real time recommendation systems require very efficient algorithm to quickly process this massive amount of data, so as to recommend users having similar interests. In this paper we present a tractable algorith...
متن کاملAutomatic Hashtag Recommendation in Social Networking and Microblogging Platforms Using a Knowledge-Intensive Content-based Approach
In social networking/microblogging environments, #tag is often used for categorizing messages and marking their key points. Also, since some social networks such as twitter apply restrictions on the number of characters in messages, #tags can serve as a useful tool for helping users express their messages. In this paper, a new knowledge-intensive content-based #tag recommendation system is intr...
متن کاملTerms of a Feather: Content-Based News Recommendation and Discovery Using Twitter
User-generated content has dominated the web’s recent growth and today the so-called real-time web provides us with unprecedented access to the real-time opinions, views, and ratings of millions of users. For example, Twitter’s 200m+ users are generating in the region of 1000+ tweets per second. In this work, we propose that this data can be harnessed as a useful source of recommendation knowle...
متن کاملGraphJet: Real-Time Content Recommendations at Twitter
This paper presents GraphJet, a new graph-based system for generating content recommendations at Twitter. As motivation, we trace the evolution of our formulation and approach to the graph recommendation problem, embodied in successive generations of systems. Two trends can be identified: supplementing batch with real-time processing and a broadening of the scope of recommendations from users t...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Web Intelligence
دوره 14 شماره
صفحات -
تاریخ انتشار 2016